Collective Classification of Posts to Internet Forums

Authors

  • Pádraig Ó Duinn
  • Derek G. Bridge
Abstract

We investigate automatic classification of posts to Internet forums. We use collective classification methods, which simultaneously classify related objects (in our case, the posts in a thread). Specifically, we compare the Iterative Classification Algorithm (ICA) with Conditional Random Fields and with conventional classifiers (k-Nearest Neighbours and Support Vector Machines). ICA invokes a local classifier, for which we use k-Nearest Neighbours (kNN). Our main contributions are twofold. First, we define experimental protocols that we believe are suitable for offline evaluation in this domain. Second, by using these protocols to run experiments on two datasets, we show that ICA with kNN has significantly higher accuracy across most of the experimental conditions.
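As a rough illustration of how ICA can wrap a local kNN classifier, here is a minimal sketch. It is not the authors' implementation: the one-dimensional content feature, the "question"/"answer" labels, the choice of the previous post's label as the only relational feature, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

LABELS = ["question", "answer"]  # hypothetical label set


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def relational(prev_label, labels):
    """One-hot encoding of the previous post's label (all zeros for a first post)."""
    return [1.0 if prev_label == lab else 0.0 for lab in labels]


def ica_classify_thread(thread, train_threads, labels, k=3, max_iter=10):
    """Label every post in a thread with a simple ICA loop.

    thread: list of content feature vectors, one per post, in thread order.
    train_threads: list of (feature_vectors, true_labels) pairs.
    """
    # Training rows: content only (for bootstrapping) and content plus
    # relational features computed from the true labels (for iteration).
    content_X, full_X, y = [], [], []
    for feats, labs in train_threads:
        for i, (f, lab) in enumerate(zip(feats, labs)):
            prev = labs[i - 1] if i > 0 else None
            content_X.append(f)
            full_X.append(f + relational(prev, labels))
            y.append(lab)

    # Bootstrap: initial labels from content features alone.
    pred = [knn_predict(content_X, y, f, k) for f in thread]

    # Iterate: recompute relational features from the current predictions
    # and reclassify each post until the labels stop changing.
    for _ in range(max_iter):
        changed = False
        for i, f in enumerate(thread):
            prev = pred[i - 1] if i > 0 else None
            new_label = knn_predict(full_X, y, f + relational(prev, labels), k)
            if new_label != pred[i]:
                pred[i] = new_label
                changed = True
        if not changed:
            break
    return pred


# Toy training data (invented): content value 0.0 looks like a question,
# 1.0 looks like an answer.
TRAIN_THREADS = [([[0.0], [1.0], [1.0]], ["question", "answer", "answer"])]

print(ica_classify_thread([[0.1], [0.9]], TRAIN_THREADS, LABELS, k=1))
```

The design point is that a post's label depends on its neighbours' labels, which are themselves predictions, so the sketch alternates between recomputing the relational features and reclassifying until nothing changes or an iteration cap is reached.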

Similar resources

Collective Stance Classification of Posts in Online Debate Forums

Online debate sites are a large source of informal and opinion-sharing dialogue on current socio-political issues. Inferring users’ stance (PRO or CON) towards discussion topics in domains such as politics or news is an important problem, and is of utility to researchers, government organizations, and companies. Predicting users’ stance supports identification of social and political groups, bu...

Recognition of Sentiment Sequences in Online Discussions

Currently 19%-28% of Internet users participate in online health discussions. In this work, we study sentiments expressed on online medical forums. As well as considering the predominant sentiments expressed in individual posts, we analyze sequences of sentiments in online discussions. Individual posts are classified into one of the five categories encouragement, gratitude, confusion, facts, an...

The Psychology of Word Use in Depression Forums

The present studies demonstrate two computerized approaches to examining the expression of depression on the Internet. Study 1 observed linguistic markers of depression in English and Spanish forums. English and Spanish posts by depressed (N=160) and non-depressed individuals (N=160) were collected from Internet forums using bulletin board systems (bbs). A computer program (LIWC2001) was used t...

Classification of Online Health Discussions with Text and Health Feature Sets

Nowadays, many health groups and forums are established on the Internet, where health consumers discuss health issues and interact with each other. Although there is a large amount of user generated content about healthcare on different social media sites, few studies have applied data mining or artificial intelligence techniques for knowledge discovery on a large scale of data in this particul...

The Psychology of Word Use in Depression Forums in English and in Spanish: Testing Two Text Analytic Approaches

The present studies demonstrate two computerized approaches to examining the expression of depression on the Internet. Study 1 observed linguistic markers of depression in English and Spanish forums. English and Spanish posts by depressed (N=160) and non-depressed individuals (N=160) were collected from Internet forums using bulletin board systems (bbs). A computer program (LIWC2001) was used t...


Publication date: 2014